 indian accent


Teleperformance uses AI to 'neutralize' Indian accents among staff

The Japan Times

Teleperformance, the largest call-center operator in the world, is rolling out an artificial intelligence system that softens English-speaking Indian workers' accents in real time in a move the company claims will make them more understandable. The technology, called accent translation, coupled with background noise cancellation, is being deployed in call centers in India, where workers provide customer support to some of Teleperformance's international clients. Teleperformance provides outsourced customer support and content moderation to global companies including Apple, ByteDance's TikTok and Samsung Electronics. "When you have an Indian agent on the line, sometimes it's hard to hear, to understand," Deputy-CEO Thomas Mackenbrock said in an interview with Bloomberg. The technology can "neutralize the accent of the Indian speaker with zero latency," he said.


Clustering and Mining Accented Speech for Inclusive and Fair Speech Recognition

arXiv.org Artificial Intelligence

Modern automatic speech recognition (ASR) systems are typically trained on tens of thousands of hours of speech data, which is one of the main factors in their success. However, the distribution of such data is typically biased towards common accents or typical speech patterns. As a result, those systems often perform poorly on atypical accented speech. In this paper, we present accent clustering and mining schemes for fair speech recognition systems that perform equally well on under-represented accented speech. For accent recognition, we applied three schemes to overcome the limited size of supervised accent data: supervised or unsupervised pre-training, distributionally robust optimization (DRO), and unsupervised clustering. These three schemes can significantly improve the accent recognition model, especially for unbalanced and small accented speech datasets. Fine-tuning ASR on Indian-accented speech mined with the proposed supervised and unsupervised clustering schemes showed 10.0% and 5.3% relative improvements, respectively, compared to fine-tuning on randomly sampled speech.
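
As a rough illustration of what a "relative improvement" figure means, the sketch below computes the percentage reduction of an error metric against a baseline. The abstract does not name the metric or report absolute values, so the function name `relative_improvement` and the sample numbers are hypothetical, chosen only to show the arithmetic.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Return the relative reduction of an error metric, in percent.

    Assumes a lower-is-better metric (for example, an error rate).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical values, not taken from the paper: a baseline error of 20.0
# reduced to 18.0 is a 10% relative improvement (2 points absolute).
print(relative_improvement(20.0, 18.0))
```

A relative figure scales with the baseline, so the same 10% relative gain can correspond to very different absolute gains depending on how well the baseline already performs.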


Svarah: Evaluating English ASR Systems on Indian Accents

arXiv.org Artificial Intelligence

India is the second-largest English-speaking country in the world, with a speaker base of roughly 130 million. Thus, it is imperative that automatic speech recognition (ASR) systems for English be evaluated on Indian accents. Unfortunately, Indian speakers are very poorly represented in existing English ASR benchmarks such as LibriSpeech, Switchboard, and Speech Accent Archive. In this work, we address this gap by creating Svarah, a benchmark that contains 9.6 hours of transcribed English audio from 117 speakers across 65 geographic locations throughout India, resulting in a diverse range of accents. Svarah comprises both read speech and spontaneous conversational data, covering domains such as history, culture, and tourism, which ensures a diverse vocabulary. We evaluate 6 open source ASR models and 2 commercial ASR systems on Svarah and show that there is clear scope for improvement on Indian accents. Svarah as well as all our code will be publicly available.
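
ASR systems are commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The abstract does not say which metric Svarah uses, so treating it as WER is an assumption here, and the function name `word_error_rate` and the sample sentences are hypothetical. A minimal sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one reference word ("an") is dropped by the recognizer.
wer = word_error_rate(
    "book an appointment for tomorrow",
    "book appointment for tomorrow",
)
print(f"WER: {wer:.2f}")
```

Production evaluations usually normalize text more carefully (punctuation, numerals, spelling variants) before scoring; this sketch only lowercases and splits on whitespace.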


New artificial intelligence program can remove accents from voices

#artificialintelligence

A Silicon Valley startup called Sanas plans to use artificial intelligence (AI) to modify the voices of call center workers to remove their accents. The company's demo features the voice of a man with an Indian accent, reading through a call center script in a simulated customer interaction. Enabling the on-screen slider for Sanas' technology seamlessly switches from obviously human audio to a processed version that sits in the uncanny valley. The voice is still noticeably synthesized, but the Indian accent is gone and replaced with a more Americanized or "white" accent. Sanas launched in August 2021 and has already received substantial funding, securing $32 million in a Series A round in June 2022. The company's founders, three former students of Stanford University, claim the funding is the largest amount ever put towards a speech technology service.


Best Automation Solutions: Time to take Healthtech to Hospitals

#artificialintelligence

We are in a grave situation in which our healthcare infrastructure is under intense strain. We have tried every means to meet people's growing needs, but the available infrastructure is not sufficient. With such limited resources, what can help us is the right kind of "Automation": automation not only in processes but also in the technology that handles patients. Product Brief: Best Kiosk provides a self-service channel for patients to register, check in for consultations, book appointments, and make payments.


Voice Is the Next Big Platform, Unless You Have an Accent

#artificialintelligence

My mother waited two months for her Amazon Echo to arrive. Then, she waited again -- leaving it in the box until I came to help her install it. Any device that requires vocal instructions makes my mother skeptical. She has bad memories of Siri. "She could not understand me," my mom told me.

